Cross-corpus speech emotion recognition based on decision boundary optimized domain adaptation
Yang WANG, Hongliang FU, Huawei TAO, Jing YANG, Yue XIE, Li ZHAO
Journal of Computer Applications    2023, 43 (2): 374-379.   DOI: 10.11772/j.issn.1001-9081.2021122043

Domain adaptation algorithms are widely used for cross-corpus speech emotion recognition. However, in minimizing domain discrepancy, many of these algorithms lose the discriminability of target-domain samples, which then cluster densely around the decision boundary of the model and degrade its performance. To address this problem, a cross-corpus speech emotion recognition method based on Decision Boundary Optimized Domain Adaptation (DBODA) was proposed. Firstly, the features were processed by convolutional neural networks. Then, the features were fed into the Maximum Nuclear-norm and Mean Discrepancy (MNMD) module, which maximizes the nuclear norm of the emotion prediction probability matrix of the target domain while reducing the inter-domain discrepancy, thereby enhancing the discriminability of the target-domain samples and optimizing the decision boundary. In six groups of cross-corpus experiments on the Berlin, eNTERFACE and CASIA speech databases, the average recognition accuracy of the proposed method was 1.68 to 11.01 percentage points higher than those of the other algorithms, indicating that the proposed model effectively reduces the sample density around the decision boundary and improves prediction accuracy.
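
The MNMD objective described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact form of the discrepancy term or the weighting, so the linear mean discrepancy, the `weight` parameter, the division by batch size, and all function names below are assumptions made for illustration.

```python
import numpy as np


def nuclear_norm(probs: np.ndarray) -> float:
    """Sum of the singular values of a (batch, classes) probability matrix."""
    return float(np.linalg.norm(probs, ord="nuc"))


def mean_discrepancy(source: np.ndarray, target: np.ndarray) -> float:
    """Squared distance between the mean feature vectors of the two domains.

    Assumption: a linear mean discrepancy stands in for whatever
    discrepancy measure the paper actually uses.
    """
    diff = source.mean(axis=0) - target.mean(axis=0)
    return float(diff @ diff)


def mnmd_loss(source_feats, target_feats, target_probs, weight=1.0):
    """Loss to minimize: small domain gap, large target nuclear norm.

    The nuclear norm enters with a negative sign so that minimizing the
    loss maximizes it; `weight` is a hypothetical trade-off parameter.
    """
    batch = target_probs.shape[0]
    gap = mean_discrepancy(source_feats, target_feats)
    return gap - weight * nuclear_norm(target_probs) / batch


# Confident, diverse predictions versus maximally uncertain ones.
confident = np.eye(3)               # each sample assigned to its own class
uncertain = np.full((3, 3), 1 / 3)  # every sample spread over all classes
feats = np.zeros((3, 4))            # identical domains, so the gap is zero

loss_confident = mnmd_loss(feats, feats, confident)
loss_uncertain = mnmd_loss(feats, feats, uncertain)
```

In this toy case the confident prediction matrix has the larger nuclear norm and therefore the lower loss, which matches the stated intent of pushing target samples away from the decision boundary.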

Speech deception detection algorithm based on denoising autoencoder and long short-term memory network
Hongliang FU, Peizhi LEI
Journal of Computer Applications    2020, 40 (2): 589-594.   DOI: 10.11772/j.issn.1001-9081.2019071183

To further improve the performance of speech deception detection, a speech deception detection algorithm based on Denoising AutoEncoder (DAE) and Long Short-Term Memory (LSTM) network was proposed. Firstly, a parallel structure of DAE and LSTM, named PDL (Parallel connection of DAE and LSTM), was constructed. Then, hand-crafted features were extracted from the speech and fed into the DAE to obtain more robust features. Simultaneously, the Mel spectra extracted after framing and windowing the speech were input into the LSTM frame by frame for frame-level deep feature learning. Finally, the two types of features were fused through a fully connected layer and batch normalization, and a softmax classifier was used for deception recognition. The experimental results on the CSC (Columbia-SRI-Colorado) corpus and a self-built corpus show that the recognition accuracy with the fused features is 65.18% and 68.04% respectively, which is up to 5.56% and 7.22% higher than those of other algorithms, indicating that the proposed algorithm can effectively improve the accuracy of deception recognition.
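
The framing and windowing step that precedes Mel spectrum extraction can be sketched in plain Python as follows. This is only an illustration of that preprocessing step: the abstract does not name the window type, frame length, or hop size, so the Hamming window and the values used here are assumptions, and the Mel filterbank, DAE, and LSTM stages are omitted.

```python
import math


def hamming(n: int) -> list[float]:
    """Symmetric Hamming window of length n (assumed window type)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(samples: list[float], frame_len: int, hop: int) -> list[list[float]]:
    """Split a signal into overlapping frames and apply the window to each.

    Trailing samples that do not fill a whole frame are dropped.
    """
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        frames.append(frame)
    return frames


# Toy signal: ten samples, frames of four samples with a hop of two.
frames = frame_signal([1.0] * 10, frame_len=4, hop=2)
```

Each resulting frame would then be converted to a Mel spectrum and passed to the LSTM one frame at a time, as the abstract describes.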
